Distribution-based Microdata Anonymization

Authors

  • Nick Koudas
  • Divesh Srivastava
  • Ting Yu
  • Qing Zhang
Abstract

Before being shared to support ad hoc aggregate analyses, microdata often need to be anonymized to protect the privacy of individuals. A variety of privacy models have been proposed for microdata anonymization. Many of these models (e.g., t-closeness) essentially require that, after anonymization, groups of sensitive attribute values follow specified distributions. To support such models, in this paper we study the problem of transforming a group of sensitive attribute values to follow a certain target distribution with minimal data distortion. Specifically, we develop and evaluate a novel methodology that combines sensitive attribute permutation and generalization with the addition of fake sensitive attribute values to achieve this transformation. We identify metrics related to the accuracy of aggregate query answers over the transformed data, and develop efficient anonymization algorithms to optimize these accuracy metrics. Using a variety of data sets, we experimentally demonstrate the effectiveness of our techniques.
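To make the idea of "adding fake sensitive attribute values" concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes the simplest possible setting: the group must match the target distribution exactly, only fake values may be added (no permutation or generalization), and the target is given as exact fractions. The function name `fake_values_needed` and the sample values are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import ceil, lcm


def fake_values_needed(values, target):
    """Return how many fake copies of each value to add so that the
    group's value frequencies equal `target` exactly, using the
    smallest possible final group size.

    values: list of sensitive attribute values in one group
    target: dict mapping value -> Fraction (must sum to 1)
    Illustrative assumption: only additions are allowed.
    """
    if sum(target.values()) != 1:
        raise ValueError("target distribution must sum to 1")
    counts = Counter(values)
    for v in counts:
        if target.get(v, 0) == 0:
            raise ValueError(f"value {v!r} has zero target probability")

    # The final size must be a multiple of every denominator so that
    # each target count is a whole number.
    denom = lcm(*(p.denominator for p in target.values()))
    # The final size must be large enough that no existing value
    # already exceeds its target share.
    lower = max(ceil(counts[v] / p) for v, p in target.items() if p > 0)
    total = denom * ceil(Fraction(lower, denom))

    return {v: int(total * p) - counts[v] for v, p in target.items()}


group = ["flu", "flu", "flu", "cold"]
uniform = {"flu": Fraction(1, 2), "cold": Fraction(1, 2)}
print(fake_values_needed(group, uniform))
```

In this toy setting, distortion is measured only by the number of fake values added; the paper's accuracy metrics for aggregate queries are not modeled here.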

Similar resources

On the Identification of Property Based Generalizations in Microdata Anonymization

The majority of search algorithms in microdata anonymization restrict themselves to a single privacy property and a single criterion to optimize. The solutions obtained are therefore of limited application, since adherence to multiple privacy models is required to impede different forms of privacy attacks. Towards this end, we propose the concept of a property based generalization (PBG) to captur...

An Algorithm for k-Anonymity-Based Fingerprinting

The anonymization of sensitive microdata (e.g. medical health records) is a widely-studied topic in the research community. A still unsolved problem is the limited informative value of anonymized microdata that often rules out further processing (e.g. statistical analysis). Thus, a tradeoff between anonymity and data precision has to be made, resulting in the release of partially anonymized mic...

Anonymity: A Formalization of Privacy - ℓ-Diversity

Anonymization of published microdata has become a very important topic nowadays. The major difficulty is to publish data of individuals in a manner that the released table both provides enough information to the public and prevents disclosure of sensitive information. Therefore, several authors proposed definitions of privacy to get anonymous microdata. One definition is called k-Anonymity and ...

Anonymity: Formalisation of Privacy – k-anonymity

Microdata is the basis of statistical studies. If microdata is released, it can leak sensitive information about the participants, even if identifiers like name or social security number are removed. A proper anonymization for statistical microdata is essential. K-anonymity has been intensively discussed as a measure for anonymity in statistical data. Quasi identifiers are attributes that might...

Journal:
  • PVLDB

Volume 2

Publication date 2009